ADAPTIVE TESTING FOR CONVERSION-RELATED
ESTIMATES RELEVANT TO A NETWORK ACCESSIBLE SITE 



TECHNICAL FIELD

The invention relates generally to processing test data that is
relevant to a specific behavior of visitors at a network accessible site, such as
a website available via the Internet, and more particularly to determining
conversion rates of visitors to such sites.

BACKGROUND ART 

With the widespread deployment of the global communications
network referred to as the Internet, the capability of providing electronic
service (e-service) has become important to even well-established traditional
business entities. An "e-service" is an on-line service that markets goods or
services, solves problems, or completes tasks. E-services are accessible on
the Internet by entering a particular Uniform Resource Locator (URL) into a
navigation program.

Operators of e-services are often interested in inducing visitors
of a website to act in a certain manner. For example, an operator (i.e., an
e-marketer) may be interested in the sale of goods or services to visitors or
may merely request that visitors register by providing selected information.
When a visitor acts in the desired manner, the event may be considered (and
will be defined herein) as a "conversion." The ratio of visitors who are
converted to the overall number of visitors is referred to as a "conversion
rate." Presently, conversion rates at Internet websites are relatively low,
typically in the range of two percent to four percent.

For various reasons, managers of websites are interested in
accurate measures of conversion rates. For example, a change in a conversion
rate may be used as a measure of the effectiveness of a promotion.
Promotional offers are often presented to visitors in order to induce the
visitors to interact with the website in a desired manner, e.g., register or
purchase a product. Promotional offers include providing a discount on the
price of the product being sold, providing free shipping and handling of the
product, and/or providing a cost-free item. The typical goal of a promotion
campaign plan is to increase the conversion rate in a cost-efficient manner.

There are a number of considerations in determining estimations
of conversion rate or other estimations of anticipated behavior by visitors
to a network accessible site. On some occasions, there is available pre-testing
information regarding the conversion rate of a website. There may be
a relatively low or relatively high level of confidence in the accuracy of such
information. Thus, one consideration is whether to incorporate the pre-testing
information into the process of determining conversion rate. A second
consideration is the selection of an approach for updating estimations. Yet
another consideration involves selecting the sample size in testing visitors.
Given the fact that each additional visitor that is tested causes a marketer to
incur an additional cost and a potential loss in market opportunity, an important
issue is determining how large the sample size needs to be in order to
achieve a target level of confidence. A fourth consideration regards the
methodology for sampling visitors for the testing.

What is needed is a method and system which address these
considerations in the estimations of anticipated visitor behavior.

SUMMARY OF THE INVENTION 

An adaptive testing approach utilizes at least some of four
components that are cooperative in providing behavioral estimations that
satisfy a required level of confidence of accuracy. As a first component of a
system or method, the process is configured to determine an initial estimation
on a basis of pre-testing information. For example, an e-marketer's prior
knowledge may be incorporated into an initial conversion rate estimation by
characterizing the knowledge with a suitable probability distribution. A second
component is configured to generate updates of the estimation in response to
monitored behavior of visitors to a network accessible site, such as a website.
In one approach, the second component utilizes Bayesian estimation to provide
updated estimations of subsequent visitor behavior. In third and fourth
components, a minimum test sample size is determined while maintaining a
target statistical confidence level. This determination is also adaptive, so that
the measure of required test sample size is dynamically adjusted upwardly or
downwardly in response to testing conditions. The third component uses
systematic sampling. The fourth component is configured to utilize negative
binomial sampling that is based on achieving the required confidence level.

Within the first component, the prior knowledge of a manager of
a site is entered and utilized. Ideally, the prior knowledge is an estimation of
the visitor behavior (e.g., conversion rate), with the estimation being within a
target confidence interval. However, in some situations, the manager is not
able to provide an estimation of the conversion rate. Instead, another type of
information may be available. For example, the manager may specify a
conversion rate mean and a standard deviation, so that parameters of a prior
distribution of the conversion rate can be determined using Bayes inference.
In another possibility, the manager may specify a range of the conversion rate
by a confidence interval. Again, Bayes inference may be used to determine
the parameters of the prior distribution.

After observations of visitor behavior are obtained, a Bayes
estimator may be used to provide automatic updates of the estimation of the
conversion rate or other behavioral parameter of interest. In one embodiment,
the point estimation is a weighted average of the pre-test estimation and a
maximum likelihood estimate that is a result of the observed behavior. Bayes
estimation is especially useful if there is prior knowledge and only a small
sample of observations, since a small sampling is susceptible to inaccuracies.

Regarding the determination of sampling size, the target number
of successes (e.g., conversions) can be determined using systematic
sampling at the third component. For example, from a probability criterion, a
sample size may be identified as a ceiling. Then, from the expected number
(N) of visitors, a requirement of the sampling pattern may be determined by
dividing the expected number by the ceiling of the sampling size. A
shortcoming of this systematic sampling approach is the concern
that the expected number of visitors will not be reached, so that the calculated
test sample size will not be reached.

In the fourth component, the shortcoming of the systematic
sampling is addressed. Specifically, negative binomial sampling is utilized.
The measure of the minimum test sample size therefore becomes dynamically
adjustable by requiring the estimate of conversion rate to satisfy a
particular statistical confidence level. The fourth component operates best in
situations in which there may be a low number of visitors to a site.

By integrating the four components, adaptive testing can
intelligently and reliably address the main concerns of conversion estimation
and testing. While the linkage of the first two components establishes the
foundation for conversion rate estimation and updating, the linkage between
the second and third components is a key to the dynamic sample size
determination and allocation that provides managers with operational agility
while maintaining targeted confidence. The linkage from the first and second
components to the third and fourth components completes the automatic
process in such a way that it provides seamless adaptive testing for predicting
visitor behavior.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic representation of an Internet-enabled
system for implementing adaptive testing of behavior of a network site in
accordance with the invention.

Fig. 2 is a block diagram of modules and components for
designing, testing and executing a promotion campaign plan within the
system of Fig. 1, with a testing module in accordance with the invention.

Fig. 3 is a process flow of steps for executing the invention. 

DETAILED DESCRIPTION

With reference to Fig. 1, a number of clients 10, 12 and 14 are
shown as being linked to a web server farm 16 via the global communications
network referred to as the Internet 18. The web server farm may include a
variety of conventional servers or may be a single server that interfaces with
the clients via the Internet. The clients may be personal computers at the
homes or businesses of potential customers of the operators of the web
server farm, if the operation is an e-service for selling goods and/or services
("products"). Alternatively, the clients 10, 12 and 14 may be other types of
electronic devices for communicating with a business enterprise via a network
such as the Internet.

The tool to be described below is intended to optimize the
increased value derived from conversions of customers when promotions are
offered to the customers. However, the adaptive testing invention may be
used in other applications in which conversions are of significance to operators.
A conversion is the act in which a visitor to a network site, such as a
website, acts in a certain manner, such as purchasing a product or registering
information.

A campaign plan for determining which promotion should be
presented to which customers is mathematically determined by an optimization
engine 20. Information may be acquired using known techniques. A
reporting and data mining component 22 receives inputs from a conventional
web log 24, observation log 26, and transactional database 28. The logs 24
and 26 acquire information either directly or indirectly from the customers at
the clients 10, 12 and 14. Indirect information includes the Internet Protocol
(IP) address of the client device. As information is acquired, the IP address
may be used to identify a particular customer or a particular geographic area
in which the client device resides. The indirect information may be obtained
from conventional "cookies." On the other hand, direct information is intentionally
entered by the client. For example, the client may complete a questionnaire
form or may enter identification information in order to receive return
information.

The transactional database 28 is a storage component for the
customer-related data. When a customer enters a particular transaction with
a business enterprise that is the operator of the web server farm 16, billing
information is acquired from the customer. The billing information is stored at
the transactional database. As more transactions occur, a customer history
may be maintained for determining purchasing tendencies regarding the
individual customer. The various customer histories can then be used to
deduce common purchasing tendencies, as well as common tendencies with
regard to reacting to promotions, so that customer modeling may occur at
the segmentation component 30 of the system. Customer segmentation is
preferably based upon a number of factors, such as income, geographical
location, profession, and product connection. Thus, if it is known that a
particular customer previously purchased a specific product, the purchase
may be used in the algorithmic determination of customer segments.

A promotions component 32 includes all of the data regarding
available promotions. The types of promotions are not critical to the invention.
Promotions may be based upon discounts, may be based upon offering
add-on items in the purchase of a larger scale item, may be based upon offering
future preferential treatment (e.g., a "gold member") or may be based
upon other factors (e.g., free shipping and handling).

A test marketing module 34 is the focus of the invention. The
test marketing module may be used to determine a conversion rate which
provides an estimate for predicting future customer behavior. For example,
the estimate of conversion rate may be used to forecast product procurement
needs. That is, the purchase of inventory may be at least partially based
upon the estimate of the conversion rate.

Interaction with the design of a promotion campaign plan by a
business manager takes place via a workstation 36. The business manager
may enter information regarding parameters such as budget constraints,
business objectives, costs and revenues. The budget constraints may relate
to different stages of the process, so that there are specific budget constraints
for the test marketing stage.

Fig. 2 illustrates the four stages of a promotion campaign plan.
In a first stage 38, an initial campaign is defined. The defined campaign is
passed to a stage 40 for the testing process that is the focus of the invention.
It is at this stage that the invention is implemented.

The test results of an initial campaign model are passed from
the test stage 40 to an optimization stage 42. It is at this stage that the
differential allocation of promotions is determined for the different customer
segments. The optimized campaign plan is then passed to an execution
stage 44. This execution stage interacts with storefront software 46, such as
that offered by Broadvision of Los Altos, California. The storefront 46 may be
run on the web servers of the farm 16 of Fig. 1, so that clients 10, 12 and 14
may link with the system using conventional techniques, such as an Internet
navigator. While the invention is described with respect to the interaction
among the four stages, the test stage 40 that is the focus of the invention may
be used in other architectures.

A number of actions take place within the campaign definition
stage 38. Necessary information is retrieved from a data warehouse 48.
One source of information for the data warehouse is the connection to the
storefront 46. This connection allows the transactions with customers to be
monitored. As relevant information is recognized, the information is stored.
This information can then be used to define the customer segments, as
indicated at component 50. Within the campaign definition stage 38, the
promotions are defined 52 and the tests for ascertaining the effectiveness of
the promotions are also defined 54. Thus, the initial model of the campaign
can be created 56. This initial campaign plan is stored at a campaign database
58.

Within the testing stage 40, the tests that are defined within the
component 54 of the campaign definition stage 38 are executed. As will be
described more fully below, the testing stage is a system module that includes
four cooperative components. As a first component 60 of the module, a manager
of the system may incorporate prior knowledge into an initial conversion
rate estimation. In one embodiment, the prior knowledge is incorporated by
characterizing the knowledge with a suitable probability distribution. In a
second component 61, an updating algorithm is used to automatically update
the conversion rate estimation as a response to monitoring behavior of
customers. Preferably, Bayesian estimation updating techniques are
employed. In a third component 62, systematic sampling is employed to
determine the minimum test sample size of customers for a given accuracy
confidence level. The concern is that the actual number of customers will fall
below the expected number, so that systematic sampling will be flawed in
some applications. Therefore, a fourth component 63 incorporates negative
binomial sampling for those occasions in which the customer count is low.
The applicable algorithms will be set forth in detail in sections that follow.

Docket No. 10007932-1

The optimization stage 42 includes defining optimization objectives
64 (i.e., business objectives) and optimization constraints 66, so that an
optimized campaign can be identified at component 68 of the stage. The
resulting plan is stored at the campaign database 58 and is transferred to the
execution stage 44.

As previously noted, the execution of the optimized plan utilizes
the storefront 46. Preferably, in addition to the execution component 70, the
stage 44 includes a capability 72 of monitoring and reoptimizing the plan.
Thus, interactions with customers are monitored to recognize changes in
dynamics which affect the campaign plan. The reoptimization is a reconfiguration
that is communicated to the campaign database 58.


DETAILS REGARDING THE TESTING STAGE 

A number of assumptions will be made in the description of the
testing stage 40. Firstly, it will be assumed that the goal of this phase is to
provide an accurate prediction of future customer behavior. Typically, this
prediction is based upon an accurate calculation of the conversion rate. In
achieving this goal, generating profit within this stage is not an issue.
Nevertheless, it is assumed that there is a testing budget. There may be an overall
testing budget for the stage and individual budgets for the different customer
segments defined in the component 50 of the campaign definition stage 38.
It will also be assumed that the overall testing duration is reasonably set forth.

A "combination" will be defined herein as a segment-promotion
pair. That is, each combination includes one customer segment that was
defined in component 50 and one promotion that was defined in component
52. Different combinations can have different deterministic/stochastic
conversion rates. In the description that is to follow, in some situations it will
be assumed that the different conversion rates are not correlated, so that the
combinations will be treated separately. In other situations it will be assumed
that the conversion rates for the different combinations are correlated, so that
they are dealt with jointly by establishing a correlated structure. Empirical
Bayesian approaches may be developed in situations where it is assumed
that the conversion rates are correlated.

Another assumption is that behaviors of visitors in the same
combination are independent of each other and the individual conversion
status ($Y_i$) for each visitor has a binomial distribution $B(1, \theta)$, where $\theta$ is
deterministic and is determined by the underlying mean customer characteristics
and promotion attribute levels. In this binomial distribution, 1 is the number of
trials for the individual visitors and $\theta$ is the conversion rate. In a first
alternative assumption, customer segmentation is assumed to be perfect as far as
conversion rate is concerned. Frequentist statistical inference approaches
will be developed under this model. As a second alternative assumption, it
will be assumed that the customer segmentation is not perfect. Consequently,
in addition to the variability of the binomial distribution, there is an
added variable of the imperfect customer segmentation. Mathematically
expressed, $Y_i \mid \theta \sim B(1, \theta)$ and $\theta \sim G(\theta_0, \eta)$, where G captures the additional
variability. Bayesian approaches are developed under this model.

A. INCORPORATION OF PRIOR KNOWLEDGE

As previously described, the first component 60 of the testing
stage 40 is a component of a testing module that allows a manager to enter
previously acquired information relevant to determining the conversion rate of
a combination. Referring to the process flow of steps of Fig. 3, the step 76 is
one in which the prior knowledge is incorporated into the determination of
visitor behavior. The invention will be described in the implementation in
which conversion rate is the target measure of customer behavior.

It is possible that the prior knowledge that is incorporated at
step 76 is a previously acquired sampling of visitors with regard to a particular
segment-promotion combination. If the sampling is designated as Y and
includes n visitors, then $Y = \{Y_1, Y_2, \ldots, Y_n\}$. If the assumptions are that the
customer segmentation is perfect and the conversion rates among the different
combinations are not correlated, classic point estimation may be used
to determine a point estimate ($\hat{\theta}$) for the conversion rate of the combination.
The point estimate is equal to the sample mean $\bar{Y}$, so that:

$$\hat{\theta} = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$ (Eqn. 1)

This point estimate for the conversion rate is the maximum likelihood estimate
(MLE) under the two assumptions. That is:

$$\bar{Y} = \arg\max_{\theta} L(\theta \mid Y)$$ (Eqn. 2)

where the joint likelihood function is:

$$L(\theta \mid Y) = \prod_{i=1}^{n} \theta^{Y_i}(1-\theta)^{1-Y_i}$$ (Eqn. 3)

Confidence intervals of the conversion rate $\theta$ may also be
identified. A confidence interval of $\theta$ has a lower limit $\theta_L$ and an upper limit
$\theta_U$. Thus, for a given confidence level $(1-\alpha)$, the probability function is such
that:

$$P(\theta_L < \theta < \theta_U) = 1-\alpha$$ (Eqn. 4)

where the two limits are functions of the observations Y. For example, a
confidence level of 95 percent means that if the experiment is repeated
under the same conditions 100 times, in about 95 of those times the resulting
interval is expected to contain the true conversion rate.

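As is known in the art of statistical economics, the confidence
interval of confidence level $1-\alpha$ is:

$$\left(\hat{\theta} - z_{1-\alpha/2}\,\mathrm{s.e.}(\hat{\theta}),\ \hat{\theta} + z_{1-\alpha/2}\,\mathrm{s.e.}(\hat{\theta})\right)$$ (Eqn. 5)

where the z value of $z_{1-\alpha/2}$ is obtainable from a standard table and where
$\mathrm{s.e.}(\hat{\theta})$ is the standard error of the point estimate ($\hat{\theta}$), with:

$$\mathrm{s.e.}(\hat{\theta}) = \sqrt{\hat{\theta}(1-\hat{\theta})/n}$$ (Eqn. 6)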
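
As a non-limiting illustration, the point estimate of Eqn. 1 and the
normal-approximation interval of Eqns. 5 and 6 may be sketched in Python as
follows. The function name normal_approx_ci and the sample of 30 conversions
among 1000 visitors are hypothetical and are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_ci(outcomes, alpha=0.05):
    """Return the point estimate (Eqn. 1) and the interval of Eqns. 5 and 6."""
    n = len(outcomes)
    theta_hat = sum(outcomes) / n                # sample mean of 0/1 outcomes
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z value for level 1 - alpha
    se = sqrt(theta_hat * (1 - theta_hat) / n)   # standard error, Eqn. 6
    return theta_hat, (theta_hat - z * se, theta_hat + z * se)


# Hypothetical sample: 30 conversions observed among 1000 visitors.
sample = [1] * 30 + [0] * 970
theta_hat, (lower, upper) = normal_approx_ci(sample)
print(f"estimate={theta_hat:.3f}, interval=({lower:.4f}, {upper:.4f})")
```

The z value is taken here from the standard library rather than from a printed
table; either source may be used.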
Rather than the normal approximation of the confidence interval,
a more exact determination can be made. For the sample Y having a sample
size of n and having y conversions, the lower and upper limits of the
confidence interval can be determined by the following equations, in which
$F_{a,b}(p)$ denotes the p-quantile of the F distribution with a and b degrees of
freedom:

$$\theta_L = \frac{y}{y + (n-y+1)\,F_{2(n-y+1),\,2y}(1-\alpha/2)}$$ (Eqn. 7)

$$\theta_U = \frac{y+1}{y + 1 + (n-y)\,/\,F_{2(y+1),\,2(n-y)}(1-\alpha/2)}$$ (Eqn. 8)
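
By way of example only, exact limits of this kind may also be obtained
without F-distribution tables. The sketch below assumes the usual
characterization of the exact limits as the values of the conversion rate at
which the binomial tail probabilities equal one half of the significance level,
and solves for them by bisection. The function names and the sample of 3
conversions among 100 visitors are hypothetical.

```python
from math import comb


def binom_cdf(k, n, theta):
    """P(X <= k) for X ~ Binomial(n, theta)."""
    return sum(comb(n, i) * theta**i * (1 - theta) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, iterations=100):
    """Find the root of f on [lo, hi], assuming f > 0 at lo and f < 0 at hi."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(y, n, alpha=0.05):
    """Exact two-sided limits for y conversions among n visitors."""
    # Lower limit: the rate at which P(X >= y) equals alpha / 2.
    lower = 0.0 if y == 0 else bisect_root(
        lambda t: alpha / 2 - (1 - binom_cdf(y - 1, n, t)))
    # Upper limit: the rate at which P(X <= y) equals alpha / 2.
    upper = 1.0 if y == n else bisect_root(
        lambda t: binom_cdf(y, n, t) - alpha / 2)
    return lower, upper


# Hypothetical sample: 3 conversions among 100 visitors.
lower, upper = exact_ci(3, 100)
print(f"exact interval=({lower:.4f}, {upper:.4f})")
```

Bisection is used only because it keeps the sketch within the standard
library; a statistics package offering F or Beta quantiles would serve equally.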



Thus far, the implementation of step 76 of Fig. 3 has been
described with the assumption that customer segmentation is perfect.
However, if this assumption is not used, Bayesian approaches may be more
advantageous. In one possible scenario, an e-marketing manager may not
be able to provide a reasonable estimator for the underlying conversion rate,
but may be able to specify some other type of information. For example, the
manager may be able to provide a mean $\theta_0$ for the conversion rate and may
be able to identify a standard deviation ($\sigma_0$). From these specifications,
parameters of a prior distribution on $\theta$ may be calculated using Bayes
inference. One choice for the prior is the Beta($\alpha$, $\beta$) distribution. The
algorithms for computing $\alpha$ and $\beta$ are as follows:



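$$\alpha = \theta_0\left(\frac{\theta_0(1-\theta_0)}{\sigma_0^2} - 1\right)$$ (Eqn. 9)

$$\beta = (1-\theta_0)\left(\frac{\theta_0(1-\theta_0)}{\sigma_0^2} - 1\right)$$ (Eqn. 10)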
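
As one hedged illustration, the parameters of a Beta prior having a specified
mean and standard deviation may be obtained by matching the first two moments
of the Beta distribution. The sketch below assumes this moment-matching
approach; the function name beta_from_mean_sd and the manager inputs (a mean
of 3 percent and a standard deviation of 1 percent) are hypothetical.

```python
def beta_from_mean_sd(theta0, sigma0):
    """Moment-matched Beta(alpha, beta) parameters for a given mean and s.d."""
    nu = theta0 * (1 - theta0) / sigma0**2 - 1   # equals alpha + beta
    if nu <= 0:
        raise ValueError("sigma0 is too large for a Beta prior with this mean")
    return theta0 * nu, (1 - theta0) * nu


# Hypothetical manager input: mean of 3 percent, standard deviation of 1 percent.
a, b = beta_from_mean_sd(0.03, 0.01)

# Check that the resulting Beta distribution reproduces the two inputs.
mean = a / (a + b)
variance = a * b / ((a + b) ** 2 * (a + b + 1))
print(f"alpha={a:.2f}, beta={b:.2f}, mean={mean:.4f}, sd={variance ** 0.5:.4f}")
```

The guard on nu reflects the fact that a Beta distribution on the unit interval
cannot have a variance as large as $\theta_0(1-\theta_0)$.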



On the other hand, the e-marketing manager may be able to specify the
range of the conversion rate by a confidence interval $(x_1, x_2)$. That is, for a
given confidence level $1-\alpha$, the probability specification is:

$$P(x_1 < \theta < x_2) = 1-\alpha$$ (Eqn. 11)

From this, the required parameters for the Beta distribution may be calculated
as follows:



(Eqn. 12)

(Eqn. 13)

(Eqn. 14)

(Eqn. 15)



After the parameters are calculated, Bayesian estimation may
be used to compute a point estimation. The techniques will be described in
the following section, since the Bayesian estimation may also be used in the
updates of the conversion rate calculation as testing is implemented.

B. BAYESIAN UPDATE ESTIMATION 

Referring to Figs. 2 and 3, in step 78, observations are obtained
during the testing process, so that an updated estimation of the conversion
rate can be obtained. The expected value (E) for $\theta$, given Y, is as follows:

$$E(\theta \mid Y) = \frac{\alpha+\beta}{\alpha+\beta+n}\cdot\frac{\alpha}{\alpha+\beta} + \frac{n}{\alpha+\beta+n}\cdot\frac{1}{n}\sum_{i=1}^{n} Y_i$$ (Eqn. 16)



The parameters $\alpha$ and $\beta$ may be determined using Eqns. 14, 15 and 16. As
can be appreciated, the expected value is the weighted average of the
estimation of conversion rate based upon the prior knowledge of step 76 (i.e.,
$\alpha/(\alpha+\beta)$) and the MLE estimator that is based upon the testing without any
prior knowledge (i.e., Eqn. 1). This Bayes estimation is especially useful if we
have prior knowledge and only a small number of observations (n is small).
As one example, given the current knowledge of on-line conversions, if a
sampling of (1, 0, 1, 1) is observed, it typically is safe to estimate that the true
conversion rate is much lower than the MLE estimate of 0.75. Such a high
MLE estimate may be regarded as an occurrence largely due to chance.
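
The weighted-average behavior described above may be illustrated with the
following sketch, which applies the observed sampling (1, 0, 1, 1) to a Beta
prior. The prior parameters (8.7 and 281.3, corresponding to a prior mean of
3 percent) and the function name posterior_mean are hypothetical and are used
only to show how strongly a small sample is tempered by prior knowledge.

```python
def posterior_mean(alpha, beta, outcomes):
    """Posterior mean of the conversion rate under a Beta(alpha, beta) prior."""
    n = len(outcomes)
    prior_mean = alpha / (alpha + beta)          # estimate from prior knowledge
    mle = sum(outcomes) / n                      # estimate from the data alone
    weight = (alpha + beta) / (alpha + beta + n) # weight placed on the prior
    return weight * prior_mean + (1 - weight) * mle


# Hypothetical prior with mean 0.03, updated with the sampling (1, 0, 1, 1).
estimate = posterior_mean(8.7, 281.3, [1, 0, 1, 1])
print(f"posterior mean={estimate:.4f}  (MLE alone would give 0.75)")
```

With these assumed values the updated estimate stays near 4 percent, rather
than jumping to the 75 percent suggested by the four observations alone.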

If no prior knowledge is available, the parameters may be
estimated using empirical Bayes analysis, which will not be described in detail
in this document.

The update of the estimate of conversion is represented by
step 80 in Fig. 3. While not critical, the update is preferably executed on a
recurring basis. Thus, a conversion rate estimation is automatically updated
as the behaviors of more customers are observed. As previously noted, one
issue is determining the minimum required sample size in testing of the
customers. In accordance with an aspect of the invention which will be described
immediately below, the process provides regularly updated minimum sample
size determinations. According to this aspect, the system is able to intelligently
and promptly either reduce the unneeded size allocated previously in
order to save testing cost and time or increase the required sample size in
order to ensure that the required confidence level is reached in the final
update.

C. SYSTEMATIC SAMPLING

Step 82 in Fig. 3 represents the components 62 and 63 in the
testing stage 40 of Fig. 2. The systematic sampling works well in applications
in which there is a high and predictable number of visitors in each customer
segment-promotion pair that defines a combination. On the other hand,
negative binomial sampling is better suited for applications in which the
number of visitors is either predictably low or unpredictable. Preferably, the
system includes both sampling approaches. Therefore, the negative binomial
sampling may be utilized until a threshold number of samples is acquired,
after which the systematic sampling may be activated.

Within the systematic sampling, a probability criterion is
proposed and it is assumed that behaviors of visitors within each combination
are independent of each other and that the conversion rates among
combinations are not correlated. Regarding the probability criterion, for a
given confidence level $(1-\alpha)$ and for an upper bound ($\varepsilon$) of the distance
between the estimate $\hat{\theta}_n$ and the true value of $\theta$ (i.e., $|\hat{\theta}_n - \theta|$), the probability
criterion is:

$$P(|\hat{\theta}_n - \theta| \le \varepsilon) \ge 1-\alpha$$ (Eqn. 17)

where $\varepsilon$ is the sampling precision. Therefore, the derived sample size is:

$$n^* = \mathrm{ceiling}\left[\left(\frac{z_{1-\alpha/2}\,\sigma}{\varepsilon}\right)^2\right]$$ (Eqn. 18)

where $\sigma = \sigma(Y_1) = \sqrt{\theta(1-\theta)}$ is the standard deviation of a single conversion
variable.

If the total number of expected visitors in a particular combination
is N, the systematic sampling scheme is to sample n* visitors from N total
visitors. Taking d = floor[N/n*], the scheme is to generate a random start s
from integers {1, 2, ..., d}, and make offers to visitors s, s+d, s+2d, ...,
s+(n*-1)d.
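
The systematic sampling scheme may be sketched as follows, purely as an
illustration. The sample size is computed from the confidence level, the
sampling precision and the standard deviation of a single conversion variable,
and every d-th visitor is then selected from a random start. The function
names, the assumed conversion rate of 3 percent, the precision of 0.01 and the
expected visitor count of 20000 are hypothetical.

```python
import random
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(theta, eps, alpha=0.05):
    """Sample size n* for precision eps at confidence level 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    sigma = sqrt(theta * (1 - theta))    # s.d. of a single conversion variable
    return ceil((z * sigma / eps) ** 2)


def systematic_sample(N, n_star, rng):
    """Positions of the visitors who receive offers: s, s+d, ..., s+(n*-1)d."""
    d = N // n_star                      # d = floor[N / n*]
    if d < 1:
        raise ValueError("expected visitor count N is smaller than n*")
    s = rng.randint(1, d)                # random start from {1, 2, ..., d}
    return [s + k * d for k in range(n_star)]


# Hypothetical inputs: assumed rate 3 percent, precision 0.01, 20000 visitors.
n_star = required_sample_size(theta=0.03, eps=0.01)
visitors = systematic_sample(N=20000, n_star=n_star, rng=random.Random(0))
print(n_star, visitors[:3], visitors[-1])
```

The explicit error for d < 1 corresponds to the situation in which the
expected visitor count cannot supply the required sample size.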

As previously noted, the shortcoming of the systematic sampling 
scheme is that the required sample size may only be reached if and when the 
N total visitors of a particular combination have visited the website. 

D. NEGATIVE BINOMIAL SAMPLING

It is supposed that the sequentially observed conversions $Y_1, Y_2, \ldots, Y_n$
are identically and independently distributed (i.e., i.i.d. ~ $B(1, \theta)$), where
$\theta$ is the conversion rate. When $\theta$ is small, as is experienced in current on-line
conversion applications, the negative binomial sampling (also referred to as
inverse binomial sampling) provides a faster solution than the random
sampling and systematic sampling techniques. If m is the number of conversions
that are determined to be needed and T is the total number of trials
needed, then T has a negative binomial distribution.

$$P(T=t) = \binom{t-1}{m-1}\theta^{m}q^{t-m}$$ (Eqn. 19)

where $q = 1-\theta$. This negative binomial distribution applies for t = m, m+1, ... .
This can be denoted by $T \sim NB(m, \theta)$. As with any other sequential sampling
scheme, the stopping point (or sample size) of NB sampling depends upon
the actual data. Specifically,




$$T = \min\left\{t : \sum_{i=1}^{t} Y_i = m\right\}$$ (Eqn. 20)



After m is determined, the implementation of T is straightforward. The
expected value of T is $m/\theta < \infty$, so that negative binomial sampling necessarily
terminates. From this property it can be seen that T/m is an unbiased
estimator for $1/\theta$. In fact, T/m is the uniform minimum variance unbiased
(UMVU) estimator of $1/\theta$. Based upon this, it can also be seen that
$\hat{\theta} = m/T$ is an estimator for $\theta$.
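
A minimal sketch of the stopping behavior of negative binomial sampling is
given below for illustration. Observation continues until the m-th conversion
is seen, at which point the number of trials T and the ratio m/T are reported.
The function name and the short stream of outcomes are hypothetical.

```python
def negative_binomial_sample(outcomes, m):
    """Observe 0/1 outcomes in order and stop at the m-th conversion."""
    conversions = 0
    for t, y in enumerate(outcomes, start=1):
        conversions += y
        if conversions == m:
            return t, m / t              # trials used and the estimate m/T
    raise ValueError("the stream ended before m conversions were observed")


# Hypothetical stream of visitor outcomes (1 = conversion, 0 = no conversion).
stream = [0, 0, 1, 0, 1, 0, 0, 1, 1, 0]
T, theta_hat = negative_binomial_sample(stream, m=3)
print(T, theta_hat)
```

Because the function stops as soon as the required number of conversions is
reached, outcomes after the stopping point are never consumed.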
Thus, it is important to determine the number of conversions m.
The following probability criterion is used in the determination of the required
success number m for a given precision level $\varepsilon$ and for a given confidence
level $1-\alpha$:

$$P\left\{\left|\frac{T}{m} - \frac{1}{\theta}\right| \le \frac{\varepsilon}{\theta}\right\} \ge 1-\alpha$$ (Eqn. 21)

The rationale of this probability criterion will become clearer from the explanation
that follows.

For the decomposition:

$$T = \sum_{j=1}^{m} T_j$$ (Eqn. 22)

where $T_1, T_2, \ldots, T_m$ are i.i.d. ~ $g(\theta)$, which is the geometric distribution with the
success parameter $\theta$. With the central limit theorem, the following
approximation can be determined for Eqn. 21:



$$\frac{\varepsilon\sqrt{m}}{\theta\,\sigma(T_1)} = z_{1-\alpha/2}$$ (Eqn. 23)

Since $\sigma^2(T_1) = (1-\theta)/\theta^2$, Eqn. 23 can be simplified to:

$$\frac{\varepsilon\sqrt{m}}{\sqrt{1-\theta}} = z_{1-\alpha/2}$$ (Eqn. 24)

For any implementation, the ceiling of the answer is used in order to obtain
the required integer. Therefore, the choice of m is:

$$m^* = \frac{z_{1-\alpha/2}^2}{\varepsilon^2}(1-\theta)$$ (Eqn. 25)
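
For illustration only, the required number of conversions may be computed as
sketched below, taking the ceiling of the squared z value multiplied by one
minus the assumed conversion rate and divided by the squared precision level.
The function name, the assumed conversion rate of 3 percent and the precision
level of 0.2 are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_conversions(theta, eps, alpha=0.05):
    """Ceiling of z^2 * (1 - theta) / eps^2 for confidence level 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(z**2 * (1 - theta) / eps**2)


# Hypothetical inputs: assumed rate 3 percent, precision level 0.2.
m_star = required_conversions(theta=0.03, eps=0.2)
expected_trials = m_star / 0.03          # expected value of T is m / theta
print(m_star, round(expected_trials))
```

The expected number of trials shown in the last line follows from the property
that the expected value of T is the conversion number divided by the
conversion rate.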

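The random event of the probability criterion of Eqn. 21 can be expressed in
terms of the difference between $\theta$ and its estimator m/T. The random event is
equivalent to the following:

$$-\frac{\varepsilon}{\theta} \le \frac{T}{m} - \frac{1}{\theta} \le \frac{\varepsilon}{\theta}$$ (Eqn. 26)

It is then, of course, true that:

$$-\frac{\varepsilon\theta}{1+\varepsilon} \le \frac{m}{T} - \theta \le \frac{\varepsilon\theta}{1-\varepsilon}$$ (Eqn. 27)

Setting the lower limit of the precision level $\varepsilon_L = -\varepsilon\theta/(1+\varepsilon)$, and setting the
upper bound of the precision level $\varepsilon_U = \varepsilon\theta/(1-\varepsilon)$, then the criterion of Eqn. 21
becomes:

$$P\left\{\varepsilon_L \le \frac{m}{T} - \theta \le \varepsilon_U\right\} \ge 1-\alpha$$ (Eqn. 28)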

The probability criterion now becomes more intuitive for the purpose of
estimating $\theta$ by the estimator m/T. Another value of this last equation is that
the manager can readily specify the precision level in estimating $\theta$, expressed
in terms of either $\varepsilon_L$ or $\varepsilon_U$. For example, once $\varepsilon_U$ is specified, then:

$$\varepsilon = \frac{\varepsilon_U}{\theta + \varepsilon_U}$$ (Eqn. 29)

$$\varepsilon = -\frac{\varepsilon_L}{\theta + \varepsilon_L}$$ (Eqn. 30)

Inserting these expressions of $\varepsilon$ into Eqn. 25, the calculations of m* become:

$$m^* = z_{1-\alpha/2}^2\,(1-\theta)\left(1 + \frac{\theta}{\varepsilon_U}\right)^2$$ (Eqn. 31)

$$m^* = z_{1-\alpha/2}^2\,(1-\theta)\left(1 + \frac{\theta}{\varepsilon_L}\right)^2$$ (Eqn. 32)

It should be noted that the confidence interval of θ in the
probability criterion is not symmetrical about θ, since it is generally true that
the absolute value of ε_L is not equal to the absolute value of ε_U. This is
different from the traditional statistical inference approach. In fact, it is
believed that the generalization to asymmetric confidence intervals has its
advantage in the conversion rate context, since conversion rates are not
regarded as having the same weight in reality. However, despite the asymmetric
confidence interval, the formulas for computing m from either ε_L or ε_U are the same.
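The relationship between the relative precision ε and the two absolute precision levels can be checked numerically. The sketch below assumes the readings ε_L = -εθ/(1+ε), ε_U = εθ/(1-ε), ε = ε_U/(θ+ε_U) = -ε_L/(θ+ε_L), and m* = z²_{1-α/2}(1-θ)(1+θ/ε_X)² for either bound ε_X. All function names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Tuple


def bounds_from_eps(theta: float, eps: float) -> Tuple[float, float]:
    """Map the relative precision eps to the pair (eps_L, eps_U)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    lower = -eps * theta / (1.0 + eps)  # negative: allowed underestimate
    upper = eps * theta / (1.0 - eps)   # positive: allowed overestimate
    return lower, upper


def eps_from_upper(theta: float, eps_u: float) -> float:
    """Recover eps from the upper precision level (Eqn. 29 as read here)."""
    return eps_u / (theta + eps_u)


def eps_from_lower(theta: float, eps_l: float) -> float:
    """Recover eps from the lower precision level (Eqn. 30 as read here)."""
    return -eps_l / (theta + eps_l)


def conversions_from_bound(theta: float, bound: float, alpha: float) -> int:
    """Same formula for either bound: z^2 * (1 - theta) * (1 + theta/bound)^2."""
    if bound == 0.0:
        raise ValueError("the precision level must be nonzero")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(z ** 2 * (1.0 - theta) * (1.0 + theta / bound) ** 2)


# Example with an assumed 3% conversion rate and 10% relative precision.
eps_l, eps_u = bounds_from_eps(theta=0.03, eps=0.1)
m_from_upper = conversions_from_bound(0.03, eps_u, alpha=0.05)
m_from_lower = conversions_from_bound(0.03, eps_l, alpha=0.05)
```

In this example the interval is asymmetric (|ε_L| is smaller than ε_U), yet both bounds lead to the same number of required conversions, which is the point made in the preceding paragraph.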

Using the negative binomial sampling approach, inputs may
include (1) the conversion number m, (2) the lower precision level ε_L, (3) the
upper precision level ε_U, (4) the confidence level 1-α, and (5) the estimated
conversion rate θ. If item (1) is specified, then we just keep observing until
the actual success number reaches m. However, it is difficult to specify m
without any prior knowledge. Therefore, specifying items (2)/(3), (4) and (5) is
required in order to compute m. Note that we only need one of (2) and (3),
not both. The desired outputs are the conversion rate point estimate and the
conversion rate confidence estimate.

After the total number m of conversions is detected during the
testing stage 40 of Fig. 2, the conversion rate point estimate is computed on
the basis that θ is estimated by m/T. Fig. 3 includes a decision step 84 of
determining whether a final required sample size has been reached. That is, a
recalculation of the sample size occurs if the conversion number m is reached
but the confidence requirements are not satisfied. If the confidence requirements
are satisfied, the process ends, but if the sample size is recalculated, the
process returns to the step 78 of obtaining observations. This process will be
described in the next section. Briefly stated, upon checking the attained
confidence interval length, it is determined whether additional sampling is
needed.
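The stopping rule of negative binomial sampling can be sketched as follows: visits are observed one at a time until the m-th conversion, and the point estimate is m/T, where T is the number of visits observed. This is an illustrative sketch only. The function name, the boolean stream of outcomes, and the simulated 3% rate are assumptions and do not describe steps 78 or 84 in detail.

```python
import random
from typing import Iterable, Iterator, Tuple


def sample_until_m_conversions(outcomes: Iterable[bool], m: int) -> Tuple[int, float]:
    """Consume visit outcomes until m conversions; return (T, m / T)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    conversions = 0
    for visits, converted in enumerate(outcomes, start=1):
        if converted:
            conversions += 1
            if conversions == m:
                # T = visits observed when the m-th conversion occurs.
                return visits, m / visits
    raise ValueError("stream ended before m conversions were observed")


def simulated_visits(rate: float, seed: int) -> Iterator[bool]:
    """Hypothetical visitor stream: each visit converts with probability `rate`."""
    rng = random.Random(seed)
    while True:
        yield rng.random() < rate


# Deterministic example: the second conversion arrives on the fifth visit.
total_visits, estimate = sample_until_m_conversions(
    [False, True, False, False, True, True], m=2
)

# Simulated example with an assumed true rate of 3% and m = 373.
sim_visits, sim_estimate = sample_until_m_conversions(simulated_visits(0.03, seed=1), 373)
```

In the deterministic example T is 5 and the estimate is 0.4. In practice the outcomes would come from logged visits rather than from a simulator.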

E. CUSTOMER ALLOCATION, SEQUENTIAL TESTING
AND TERMINATION

During the testing stage, there are a number of considerations
that must be addressed. One consideration is the technique for allocating
promotions to arriving visitors within a particular customer segment. In some
situations, the assumed conversion rates are not informative. For example,
an e-marketer may not be able to provide any relevant information. In such
situations, there are advantages to allocating all promotions to the arriving
visitors alternately during the testing stage. Thus, if there are two promotions,
the odd numbered visits result in presentation of the first promotion, while the
even numbered visits trigger presentation of the second promotion. This
achieves some randomization effect, which can reduce unrecognized biases. On
the other hand, if informative inputs on the assumed conversion rates are
available for the different combinations, a proportional sampling scheme may
be implemented in the allocation approach.
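The alternating allocation described above amounts to cycling through the promotions by visit number. A minimal sketch, assuming visits within a customer segment are numbered from 1 and using a hypothetical function name, might look like this:

```python
from typing import Sequence, TypeVar

P = TypeVar("P")


def allocate_alternately(visit_number: int, promotions: Sequence[P]) -> P:
    """Return the promotion for the given 1-based visit number.

    With two promotions, odd visits receive the first promotion and
    even visits receive the second; with more, the cycle simply extends.
    """
    if visit_number < 1:
        raise ValueError("visit_number is 1-based")
    if not promotions:
        raise ValueError("at least one promotion is required")
    return promotions[(visit_number - 1) % len(promotions)]


# Example with two hypothetical promotions.
shown = [allocate_alternately(n, ["discount", "free shipping"]) for n in range(1, 5)]
```

The proportional sampling scheme mentioned for the informative case is not sketched here, since the text does not specify how the proportions are derived.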

For each combination of a customer segment and a promotion,
an index (c) can be assigned on the basis of c = (i, k), where i is the customer
segment and k is the promotion. Upon reaching a closing time (t), an attained
confidence interval length D(c, t) is computed. Then, termination occurs for
those c's in which the "convergence" of D(c, t) has been reached. One
termination criterion for convergence is whether the variable moving average
reaches a threshold. That is, termination occurs for those combinations that
satisfy:

\[ \left\{ \sum_{i=1}^{t} D(c,i) / t \right\} / \theta_t < 1-\alpha \qquad \text{(Eqn. 33)} \]

where 1-α is a prespecified stabilization confidence level. For those
combinations that do not satisfy the criterion, if there is no promotion budget
problem for continuing the sampling until t+1, the process runs for those
combinations until t+1. On the other hand, if there are promotional budget
problems, then ranking may occur for all remaining D(c) = D(c, t), with proper
aggregation over t. After the ranking, the combinations with the lowest D
values are terminated until there are no longer any promotional budget
concerns. For the rest, the sampling continues to run until time t+1,
whereafter the criterion of Eqn. 33 is again applied.
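The termination logic can be sketched in two small functions. This is an illustration under stated assumptions rather than the described implementation: the normalizing term in Eqn. 33 is passed in explicitly as `normalizer` because its exact form is uncertain in the source (it is read here as the conversion rate estimate at time t), and the per-combination `costs` and the overall `budget` are hypothetical stand-ins for the "promotional budget" concern, which the text does not quantify.

```python
from typing import Dict, Hashable, List, Sequence


def has_converged(d_history: Sequence[float], normalizer: float, threshold: float) -> bool:
    """Check whether the average of D(c, 1..t), divided by normalizer, is below threshold."""
    if not d_history:
        raise ValueError("at least one interval length is required")
    if normalizer <= 0.0:
        raise ValueError("normalizer must be positive")
    moving_average = sum(d_history) / len(d_history)
    return moving_average / normalizer < threshold


def terminate_for_budget(
    d_values: Dict[Hashable, float],
    costs: Dict[Hashable, float],
    budget: float,
) -> List[Hashable]:
    """Terminate combinations with the lowest D first until the rest fit the budget."""
    remaining_cost = sum(costs[c] for c in d_values)
    terminated: List[Hashable] = []
    # Rank by attained interval length, lowest D first.
    for combo in sorted(d_values, key=lambda c: d_values[c]):
        if remaining_cost <= budget:
            break
        terminated.append(combo)
        remaining_cost -= costs[combo]
    return terminated


# Example: three (segment, promotion) combinations, each costing 10 units per period.
d_now = {("s1", "p1"): 0.01, ("s1", "p2"): 0.03, ("s2", "p1"): 0.02}
cost_now = {c: 10.0 for c in d_now}
stopped = terminate_for_budget(d_now, cost_now, budget=15.0)
```

In this example the two combinations with the shortest intervals are stopped, and only ("s1", "p2") continues to time t+1, after which the convergence check would be applied again.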

Upon the termination of any combination, the resulting conversion
rate is used to reevaluate the sample size requirement for additional
sampling needs. On some occasions, the termination period will be reestablished.
As a note, it may be beneficial to store all raw conversion data in a log
for subsequent use.






